Psychophysical evaluation of PSOLA: natural versus synthetic speech

Authors

Reinier Kortekaas

Armin Kohlrausch

Abstract

This paper presents the results of psychophysical experiments dealing with pitch-marker positioning within the Pitch Synchronous OverLap and Add (PSOLA) framework. Sustained natural vowels were PSOLA-modified in fundamental frequency. The experiments were aimed at determining the auditory sensitivity to (1) deterministic shifts of either all or single pitch markers within a sequence, and (2) random shifts of all pitch markers ("jitter"). For deterministic shifts of all pitch markers, the results were in reasonable agreement with results obtained previously for synthetic formant signals. For deterministic shifts of single pitch markers, thresholds depended on position in the sequence. Detection thresholds for jittered shifts were comparable to thresholds for detecting jitter in pulse trains. The ranking of the thresholds for these three conditions indicated that the auditory system is more sensitive to dynamic (modulation) cues than to static (timbral) cues arising from shifts in pitch-marker positioning.
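The three marker manipulations named in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the evenly spaced starting markers, and the Gaussian distribution used for jitter are all assumptions made for illustration only.

```python
import random


def regular_markers(f0_hz, n_markers, start_s=0.0):
    """Evenly spaced pitch markers (seconds) for a sustained vowel (assumed setup)."""
    period = 1.0 / f0_hz
    return [start_s + k * period for k in range(n_markers)]


def shift_all(markers, delta_s):
    """Deterministic shift: move every marker by the same offset."""
    return [t + delta_s for t in markers]


def shift_single(markers, index, delta_s):
    """Deterministic shift: move only the marker at `index`."""
    shifted = list(markers)
    shifted[index] += delta_s
    return shifted


def jitter_all(markers, sd_s, seed=0):
    """Random shift ("jitter"): perturb every marker independently.

    Gaussian noise is an assumption; the abstract does not state the distribution.
    """
    rng = random.Random(seed)
    return [t + rng.gauss(0.0, sd_s) for t in markers]


def intervals(markers):
    """Spacing between consecutive markers."""
    return [b - a for a, b in zip(markers, markers[1:])]


if __name__ == "__main__":
    base = regular_markers(f0_hz=100.0, n_markers=6)
    print(intervals(shift_all(base, 0.001)))        # spacing unchanged
    print(intervals(shift_single(base, 3, 0.001)))  # two adjacent intervals change
    print(intervals(jitter_all(base, 0.0005)))      # every interval perturbed
```

In this sketch a uniform shift leaves the marker spacing untouched, so only the position of each marker relative to the waveform changes, whereas a single-marker shift or jitter alters the spacing itself.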


Similar resources

Analysis of the degradation of French vowels induced by the TD-PSOLA algorithm, in text-to-speech context

In concatenative speech synthesis systems, synthetic speech is obtained by concatenating acoustic units selected from a database of natural speech. The duration and fundamental frequency (F0) of the selected units are usually different from those requested by a prosodic model, and so some prosodic modification must be applied to the units in order to obtain the desired target. TD-PSOLA is an ef...


Concatenative Speech Synthesis: A Review

The primary objective of this paper is to provide an overview of existing concatenative text-to-speech synthesis techniques. Concatenative speech synthesis can be broadly divided into three categories: diphone-based, corpus-based, and hybrid. Diphone-based speech synthesis relies on different signal processing techniques such as PSOLA, FD-PSOLA, etc. These signal processing techniques introdu...


Evaluation of a Multilingual TTS System with Respect to the Prosodic Quality

Improving the naturalness of synthetic speech is an essential task in developing a text-to-speech (TTS) system. Mainly, it depends on the quality of the prosody model which is utilized in the TTS system. For our TTS system called DreSS (Dresden Speech Synthesizer), we compared three different methods for generating the F0 contour to each other as well as to other synthesizers. Natural speech sa...


A hybrid method oriented to concatenative text-to-speech synthesis

In this paper we present a speech synthesis method for diphone-based text-to-speech systems. Its main goal is to achieve prosodic modifications that result in more natural-sounding synthetic speech. This improvement is especially useful for emotional speech synthesis, which requires high-quality prosodic modification. We present a hybrid method based on TD-PSOLA and the harmonic plus noise model...


A new synthesis algorithm using phase information for TTS systems

New speech synthesis algorithms capable of flexible prosody (especially F0) modification are desired for a high quality TTS system. TD-PSOLA is the most popular synthesis algorithm. The algorithm shows very high quality when F0 modification is limited. However, the quality degradation due to pitch epoch detection error becomes severe as the F0 modification factor becomes large. On the othe...



Publication date: 1997
